SynthNet: Learning to synthesize music end-to-end

F Schimbinschi; C Walder; SM Erfani; J Bailey

Conference Proceedings

IJCAI International Joint Conference on Artificial Intelligence | IJCAI | Published: 2019

DOI: 10.24963/ijcai.2019/467

Open access

Abstract

We consider the problem of learning a mapping directly from annotated music to waveforms, bypassing traditional single note synthesis. We propose a specific architecture based on WaveNet, a convolutional autoregressive generative model designed for text to speech. We investigate the representations learned by these models on music and conclude that mappings between musical notes and the instrument timbre can be learned directly from the raw audio coupled with the musical score, in binary piano roll format. Our model requires minimal training data (9 minutes), is substantially better in quality and converges 6 times faster in comparison to strong baselines in the form of powerful text to spee..
